
Bugfix #23 and improvements as suggested in #48 #50

Merged
merged 47 commits into from
Jan 9, 2025

Conversation

@arizon-dread (Collaborator) commented Jun 19, 2024

Bugfix and suggested improvements implemented

Fixes #23 by checking for the error from clamd.ScanStream, which closes the connection if the file size limit is exceeded. The error was previously ignored. This PR assumes that a closed connection is caused by an exceeded file size, since unfortunately there is no way to detect the ^INSTREAM: file size limit exceeded error from the clamd process in the API. A custom clamd.ScanResult is created inside the error-handling if statement, so the response goes through the same switch/case logic as other responses. The scanHandlerBody func handles this the same way, so the client actually gets a response.

Fixes #48 by creating a new endpoint (/v2/scan) and a scanResult struct that contains status, description and httpStatus (httpStatus is excluded via the json annotation and only used in the code logic). An array of all scanned files (as []scanResult) is then marshalled to json, producing a proper json response for one to many files and returning an array of json objects to the client. The old /scan endpoint uses the same response (formatted the same way as before), but not wrapped in an array: one to many json objects without a proper json array structure, to keep previous behavior intact. The old endpoint does, however, return deprecation and link headers indicating that a new endpoint is available.

Previously, the first file's status determined the http status of the entire response. This PR always returns a 406 http status if any file contains a virus, and only returns 200 OK if all files are clean, for both the old and the new endpoint. I figured the old behavior was a bug.
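The aggregation rule is simple enough to sketch; `overallStatus` is a hypothetical name for illustration:

```go
package main

import (
	"fmt"
	"net/http"
)

// overallStatus (hypothetical name) implements the rule described above:
// 406 if any scanned file is infected, 200 only when every file is clean.
func overallStatus(fileStatuses []int) int {
	for _, s := range fileStatuses {
		if s == http.StatusNotAcceptable {
			return http.StatusNotAcceptable // one infected file taints the whole response
		}
	}
	return http.StatusOK
}

func main() {
	fmt.Println(overallStatus([]int{200, 406, 200})) // 406
	fmt.Println(overallStatus([]int{200, 200}))      // 200
}
```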

I also added a prometheus metrics counter that increments for each virus found.

I also added go.mod and go.sum to use Go modules instead of the vendor directory.

@arizon-dread arizon-dread marked this pull request as draft June 19, 2024 09:00
@arizon-dread arizon-dread marked this pull request as ready for review June 19, 2024 09:08
@arizon-dread commented Oct 17, 2024

I got this working on my local machine yesterday after merging the original repo's master branch into my fork's scan-v2, and successfully tested all the endpoints. Then I built the docker image, but when starting it up, it can't parse clamd.conf or freshclam.conf. I looked into it and nothing looked off as far as I could see, although the "Foreground" setting seems to take "yes" as the value rather than "true"; changing that in the sed command in the Dockerfile did nothing to solve the issue. I will look into it further tonight.
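For illustration, the kind of sed rewrite being discussed looks roughly like this. The path and the exact expressions in the real Dockerfile differ; this only sketches the idea of switching the Foreground value:

```shell
# Hypothetical sample line standing in for the shipped clamd.conf,
# where the option is commented out with a "true"/"false" value.
printf '#Foreground false\n' > /tmp/clamd.conf.sample

# Uncomment the option and set it to "yes", which recent ClamAV
# versions expect instead of "true".
sed -i 's/^#\?Foreground .*/Foreground yes/' /tmp/clamd.conf.sample

grep '^Foreground yes' /tmp/clamd.conf.sample
```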

@arizon-dread commented
@davosian Do you have any insights into this issue parsing the config files? It seems like the version that gets installed in alpine 3.20 is 1.22.r0, which is a little weird considering it's not an LTS release but a release candidate.

@davosian (Contributor) commented

Hi @arizon-dread,

I finally got around to going through the latest PR, so it is about time to get your changes merged into the project. Since you were kind enough to review PR #59, I was wondering whether your tests included the changes from this PR as well?

Great to hear about your openness to supporting the project! As you can tell, my reactions are on the slow end (which I am hoping to improve), so any support I can get to keep this project active is very welcome.

I also just checked the version included in Alpine Edge and even there it is a release candidate (1.3.2-r0). Not sure though whether this is normal for Alpine's Edge versions.

@arizon-dread commented Nov 13, 2024

Hi @davosian

My tests in #59 did not include my code in this PR; I tested the PR's code as it was intended to be merged (checked out the branch from @christianbumann's fork). I have now merged the master branch into my scan-v2 after the merge of #59 and built it. Building and starting the container works, but it seems I need to wait until tomorrow before new signatures are published so I can verify the version update, like @christianbumann had to do when he was working on #59, so I'll get back with results tomorrow.

Yes, the Alpine repos seem to be a bit off with the clamav versions: the official clamav download page promotes 1.4.1 LTS and 1.0.7 LTS, and 1.4.1 is also the version on the official docker hub repo. That's why I was contemplating whether we should switch the base image to the official clamav image to get an LTS release instead. But if we decide to do that, I think it needs to be its own PR from a clean fork of master, not mixed with other stuff like the changes in this PR. In that case, we need to look through the config even more thoroughly to make sure they haven't made breaking changes or changed syntax, which would break our sed approach when building this image.

I'll get back tomorrow with version update results!

@arizon-dread commented

It seems to be working as expected. I also fixed centos.Dockerfile, which was very outdated and wouldn't even build. Centos stream8 is EOL and its repos are offline, so I upgraded to stream9 and modified the Dockerfile to create the folder structure used in the standard Dockerfile (to match the entrypoint.sh script). The centos base image has clamav 1.0.7 in its repos, so the version is slightly older.

@arizon-dread arizon-dread mentioned this pull request Jan 8, 2025
@davosian left a comment

I found that the prometheus metric no_of_found_viruses only considers the new /v2/scan endpoint. Is this by design?

@davosian left a comment

The default for MAX_SCAN_SIZE is 100M; the default for MAX_FILE_SIZE is 25M. To me it would logically make more sense to set the default for MAX_FILE_SIZE equal to or larger than MAX_SCAN_SIZE; otherwise, MAX_SCAN_SIZE would never be hit. Or am I misunderstanding something?

@arizon-dread commented

@davosian

> I found that the prometheus metric no_of_found_viruses only considers the new /v2/scan endpoint. Is this by design?

I had just not thought it through completely; of course, every virus hit should be recorded as an uptick, regardless of which endpoint finds it. I initially had an idea to build a middleware that intercepts all responses on the way back to the client and increments the prometheus counter if the status code is 406, but after looking into this I've realized it's not as straightforward as I initially thought: creating a middleware in Go is fairly easy, but the status code is not an accessible field on the http.ResponseWriter when the response returns to the client. Since we already have another PR, #51, suggesting that we make the responses uniform and use the same func for all the endpoints to determine the response status, my suggestion is that we centralize the prometheus virus ticker logic there instead. I have already started in that direction with my getResponseStatus func; the logical approach is to increment the counter whenever a 406 is registered for the response. Let me see what I can do on this.

> The default for MAX_SCAN_SIZE is 100M; the default for MAX_FILE_SIZE is 25M. To me it would logically make more sense to set the default for MAX_FILE_SIZE equal to or larger than MAX_SCAN_SIZE; otherwise, MAX_SCAN_SIZE would never be hit. Or am I misunderstanding something?

These seem to be the default values in the clamd.conf file out of the box; it looks like that on my machine too. I'm not sure why, though.

@arizon-dread commented Jan 8, 2025

I have done some late-night refactoring to be more uniform in how the statuses are traversed, how the correct http status is set, and how the virus-found ticker is incremented. I will not be changing the response status of scanPathHandler; it has always returned 200. I only added a check for each virus found there, which also increments the counter.

The /scanHandlerBody, /scan and /v2/scan endpoints now use the same funcs for translating clamd.RES_${status} to an http status code, for evaluating statuses, and for writing the status code header. This is also where the prometheus incrementation happens for those three endpoints. Hopefully a bit more uniform than before. I have tried to keep backwards compatibility intact so as not to break anything.
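The shared translation step can be sketched like this. The status strings are the ones clamd reports ("OK", "FOUND"); the function name and the fallback mapping are assumptions for illustration, not the PR's actual code:

```go
package main

import (
	"fmt"
	"net/http"
)

// clamStatusToHTTP (hypothetical name) maps a clamd result status to the
// http status code the service returns for that file.
func clamStatusToHTTP(status string) int {
	switch status {
	case "OK": // clamd.RES_OK
		return http.StatusOK
	case "FOUND": // clamd.RES_FOUND: virus detected
		return http.StatusNotAcceptable
	default: // errors, parse errors, etc.
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(clamStatusToHTTP("OK"), clamStatusToHTTP("FOUND"))
}
```

With all three endpoints funnelled through one such func, the prometheus increment has a single natural home wherever a 406 is produced.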

@davosian commented Jan 9, 2025

> The default for MAX_SCAN_SIZE is 100M; the default for MAX_FILE_SIZE is 25M. To me it would logically make more sense to set the default for MAX_FILE_SIZE equal to or larger than MAX_SCAN_SIZE; otherwise, MAX_SCAN_SIZE would never be hit. Or am I misunderstanding something?

> These seem to be the default values in the clamd.conf file out of the box; it looks like that on my machine too. I'm not sure why, though.

I checked again the defaults and usage in clamav itself:

```
# This option sets the maximum amount of data to be scanned for each input
# file. Archives and other containers are recursively extracted and scanned
# up to this value.
# Value of 0 disables the limit
# Note: disabling this limit or setting it too high may result in severe damage
# to the system.
# Default: 400M
#MaxScanSize 1000M

# Files larger than this limit won't be scanned. Affects the input file itself
# as well as files contained inside it (when the input file is an archive, a
# document or some other kind of container).
# Value of 0 disables the limit.
# Note: disabling this limit or setting it too high may result in severe damage
# to the system.
# Technical design limitations prevent ClamAV from scanning files greater than
# 2 GB at this time.
# Default: 100M
#MaxFileSize 400M
```

I am still not sure why MaxScanSize defaults to a higher value than MaxFileSize, but my take is that MaxScanSize targets the total extracted data of compressed formats like archives, while MaxFileSize limits the size of each file contained in them.

In other words: as you already pointed out, @arizon-dread, we are applying the same logic as the clamav project itself, and therefore we should stick to it.

@davosian left a comment

I did some more testing around the old and new endpoints. The service responded as expected and the counter behaved accordingly: resetting with a fresh start of the container, honoring the MAX_FILE_SIZE, and counting up for each virus found, independently of the endpoint used. Therefore, I would say we are ready to merge this PR. 💪

@davosian davosian merged commit 82744fe into ajilach:master Jan 9, 2025
1 check passed